A Unified Switching System Perspective and Convergence Analysis of Q-Learning Algorithms

Neural Information Processing Systems

The ODE (ordinary differential equation) approach is a standard tool for analyzing the convergence of reinforcement learning algorithms. However, its application to Q-learning has been limited due to the presence of the max-operator, which makes the associated ODE model a complex nonlinear system. In contrast, the ODE associated with TD learning for policy evaluation is a linear system, whose asymptotic stability is much easier to analyze in general.
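To make the contrast concrete, here is a schematic of the two ODE models (the notation is illustrative, not verbatim from the paper):

```latex
% Schematic only; notation is illustrative, not verbatim from the paper.
\[
  \underbrace{\dot{\theta}_t = A\,\theta_t + b}_{\text{TD learning: linear ODE}}
  \qquad \text{vs.} \qquad
  \underbrace{\dot{\theta}_t = A_{\sigma(\theta_t)}\,\theta_t + b_{\sigma(\theta_t)}}_{\text{Q-learning: switching affine ODE}}
\]
% Here $\sigma(\theta)$ is the greedy policy induced by $\theta$: the
% max-operator makes the system matrix switch whenever the greedy policy
% changes, which is what turns the stability analysis nonlinear.
```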


BOOKCOREF: Coreference Resolution at Book Scale

Martinelli, Giuliano, Bonomo, Tommaso, Cabot, Pere-Lluís Huguet, Navigli, Roberto

arXiv.org Artificial Intelligence

Coreference Resolution systems are typically evaluated on benchmarks containing small- to medium-scale documents. When it comes to evaluating long texts, however, existing benchmarks, such as LitBank, remain limited in length and do not adequately assess system capabilities at the book scale, i.e., when co-referring mentions span hundreds of thousands of tokens. To fill this gap, we first put forward a novel automatic pipeline that produces high-quality Coreference Resolution annotations on full narrative texts. Then, we adopt this pipeline to create the first book-scale coreference benchmark, BOOKCOREF, with an average document length of more than 200,000 tokens. We carry out a series of experiments showing the robustness of our automatic procedure and demonstrating the value of our resource, which enables current long-document coreference systems to gain up to +20 CoNLL-F1 points when evaluated on full books. Moreover, we report on the new challenges introduced by this unprecedented book-scale setting, highlighting that current models fail to deliver the same performance they achieve on smaller documents. We release our data and code to encourage research and development of new book-scale Coreference Resolution systems at https://github.com/sapienzanlp/bookcoref.
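For reference, the CoNLL-F1 figure cited above is conventionally the unweighted average of the MUC, B-cubed, and CEAF_e F1 scores; a minimal sketch (with illustrative numbers, not results from the paper):

```python
def conll_f1(muc_f1: float, b_cubed_f1: float, ceaf_e_f1: float) -> float:
    """CoNLL-F1: the unweighted mean of the MUC, B^3, and CEAF_e F1 scores."""
    return (muc_f1 + b_cubed_f1 + ceaf_e_f1) / 3.0

# Illustrative numbers only:
print(conll_f1(78.0, 70.5, 68.4))  # -> 72.3
```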


Deep Q-Learning with Gradient Target Tracking

Lee, Donghwan, Park, Bum Geun, Lee, Taeho

arXiv.org Artificial Intelligence

This paper introduces Q-learning with gradient target tracking, a novel reinforcement learning framework that provides a learned continuous target update mechanism as an alternative to the conventional hard update paradigm. In the standard deep Q-network (DQN), the target network is a copy of the online network's weights, held fixed for a number of iterations before being periodically replaced via a hard update. While this stabilizes training by providing consistent targets, it introduces a new challenge: the hard update period must be carefully tuned to achieve optimal performance. To address this issue, we propose two gradient-based target update methods: DQN with asymmetric gradient target tracking (AGT2-DQN) and DQN with symmetric gradient target tracking (SGT2-DQN). These methods replace the conventional hard target updates with continuous and structured updates using gradient descent, which effectively eliminates the need for manual tuning. We provide a theoretical analysis proving the convergence of these methods in tabular settings. Additionally, empirical evaluations demonstrate their advantages over standard DQN baselines, suggesting that gradient-based target updates can serve as an effective alternative to conventional target update mechanisms in Q-learning.
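A minimal sketch of the contrast between the two update styles, under my own simplifying assumptions (the exact AGT2-DQN/SGT2-DQN updates and the role of the asymmetry are specified in the paper):

```python
import numpy as np

def hard_update(target_w: np.ndarray, online_w: np.ndarray) -> np.ndarray:
    """Conventional DQN: copy the online weights into the target network
    every C steps; the period C must be tuned by hand."""
    return online_w.copy()

def gradient_tracking_update(target_w: np.ndarray, online_w: np.ndarray,
                             beta: float = 0.01) -> np.ndarray:
    """Continuous alternative (illustrative, not the paper's exact rule):
    one gradient-descent step on the quadratic tracking loss
    L(w_t) = 0.5 * ||w_t - w||^2, i.e. w_t <- w_t + beta * (w - w_t)."""
    return target_w + beta * (online_w - target_w)
```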


Finite-Time Error Analysis of Soft Q-Learning: Switching System Approach

Jeong, Narim, Lee, Donghwan

arXiv.org Artificial Intelligence

Soft Q-learning is a variation of Q-learning designed to solve entropy-regularized Markov decision problems, where an agent aims to maximize the entropy-regularized value function. Despite its empirical success, there have been limited theoretical studies of soft Q-learning to date. This paper aims to offer a novel and unified finite-time, control-theoretic analysis of soft Q-learning algorithms. We focus on two types of soft Q-learning algorithms: one utilizing the log-sum-exp operator and the other employing the Boltzmann operator. By using dynamical switching system models, we derive novel finite-time error bounds for both soft Q-learning algorithms. We hope that our analysis will deepen the current understanding of soft Q-learning by establishing connections with switching system models and may even pave the way for new frameworks in the finite-time analysis of other reinforcement learning algorithms.
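The two soft operators mentioned above are easy to illustrate numerically; a minimal sketch with temperature tau (my notation, not the paper's full algorithm):

```python
import numpy as np

def log_sum_exp_operator(q_values: np.ndarray, tau: float) -> float:
    """Soft maximum via log-sum-exp: tau * log(sum_a exp(Q(s,a)/tau)).
    Upper-bounds max_a Q(s,a) and approaches it as tau -> 0."""
    z = q_values / tau
    m = z.max()  # subtract the max for numerical stability
    return tau * (m + np.log(np.exp(z - m).sum()))

def boltzmann_operator(q_values: np.ndarray, tau: float) -> float:
    """Boltzmann-weighted average: sum_a softmax(Q/tau)_a * Q(s,a).
    Lower-bounds max_a Q(s,a) and approaches it as tau -> 0."""
    w = np.exp((q_values - q_values.max()) / tau)
    w /= w.sum()
    return float(w @ q_values)

q = np.array([1.0, 2.0, 3.0])
print(log_sum_exp_operator(q, 0.5))  # ~3.07, slightly above max(q)
print(boltzmann_operator(q, 0.5))    # ~2.85, slightly below max(q)
```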


Finite-Time Analysis of Simultaneous Double Q-learning

Na, Hyunjun, Lee, Donghwan

arXiv.org Artificial Intelligence

Q-learning is one of the most fundamental reinforcement learning (RL) algorithms. Despite its widespread success in various applications, it is prone to overestimation bias in its update. To address this issue, double Q-learning employs two independent Q-estimators, which are randomly selected and updated during the learning process. This paper proposes a modified double Q-learning, called simultaneous double Q-learning (SDQ), together with its finite-time analysis. SDQ eliminates the need for random selection between the two Q-estimators, and this modification allows us to analyze double Q-learning through the lens of a novel switching system framework, facilitating an efficient finite-time analysis. Empirical studies demonstrate that SDQ converges faster than double Q-learning while retaining the ability to mitigate the maximization bias. Finally, we derive a finite-time expected error bound for SDQ.
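A tabular sketch of the modification as I read the abstract (the paper's exact update rule may differ): standard double Q-learning flips a coin and updates one estimator per step, whereas SDQ updates both simultaneously, each bootstrapping from the other:

```python
import numpy as np

def sdq_update(QA, QB, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One simultaneous double Q-learning step (illustrative sketch).
    Both tabular estimators update every step; each evaluates its own
    greedy action with the *other* estimator to curb overestimation."""
    a_star = np.argmax(QA[s_next])  # greedy action w.r.t. QA
    b_star = np.argmax(QB[s_next])  # greedy action w.r.t. QB
    td_a = r + gamma * QB[s_next, a_star] - QA[s, a]
    td_b = r + gamma * QA[s_next, b_star] - QB[s, a]
    QA[s, a] += alpha * td_a
    QB[s, a] += alpha * td_b
```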


A finite time analysis of distributed Q-learning

Lim, Han-Dong, Lee, Donghwan

arXiv.org Artificial Intelligence

Multi-agent reinforcement learning (MARL) has witnessed a remarkable surge in interest, fueled by the empirical success achieved in applications of single-agent reinforcement learning (RL). In this study, we consider a distributed Q-learning scenario, wherein a number of agents cooperatively solve a sequential decision-making problem without access to the central reward function, which is the average of the local rewards.
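A rough sketch of this cooperative setting (the communication model, mixing matrix, and step sizes below are my assumptions, not the paper's exact algorithm): each agent observes only its local reward and mixes its Q-estimate with those of its neighbors, so that in effect the network optimizes the average reward.

```python
import numpy as np

def distributed_q_step(Q, local_rewards, W, s, a, s_next,
                       alpha=0.1, gamma=0.99):
    """One illustrative distributed Q-learning step for N agents sharing a
    transition (s, a, s'). Q has shape (N, S, A); W is a doubly stochastic
    mixing matrix over the communication graph."""
    mixed = np.einsum('ij,jsa->isa', W, Q)  # consensus: average with neighbors
    for i in range(Q.shape[0]):             # local TD step with local reward
        td = local_rewards[i] + gamma * mixed[i, s_next].max() - mixed[i, s, a]
        Q[i] = mixed[i]
        Q[i, s, a] += alpha * td
    return Q
```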


Regularized Q-learning

Lim, Han-Dong, Lee, Donghwan

arXiv.org Artificial Intelligence

Q-learning is a widely used algorithm in the reinforcement learning (RL) community. In the lookup-table setting, its convergence is well established. However, its behavior is known to be unstable in the linear function approximation case. This paper develops a new Q-learning algorithm, called RegQ, that converges when linear function approximation is used. We prove that simply adding an appropriate regularization term ensures convergence of the algorithm. Its stability is established using a recent analysis tool based on switching system models. Moreover, we experimentally show that RegQ converges in environments where Q-learning with linear function approximation is known to diverge. An error bound on the solution to which the algorithm converges is also given.
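A minimal sketch of the idea as I read it (the exact regularizer and step-size conditions are in the paper): with linear function approximation Q(s, a) ≈ phi(s, a)^T theta, an l2-style regularization term is added to the standard Q-learning update.

```python
import numpy as np

def regq_step(theta, phi, phi_next_all, r,
              alpha=0.01, gamma=0.99, eta=0.1):
    """One illustrative regularized Q-learning step.
    theta: weight vector (d,); phi: feature of the visited (s, a), shape (d,);
    phi_next_all: features of (s', a') for every action a', shape (A, d);
    eta: regularization strength (an assumption, not the paper's exact term)."""
    q_next_max = (phi_next_all @ theta).max()        # max_a' Q(s', a')
    td_error = r + gamma * q_next_max - phi @ theta  # standard TD error
    return theta + alpha * (td_error * phi - eta * theta)  # regularized step
```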


Finite-Time Analysis of Minimax Q-Learning for Two-Player Zero-Sum Markov Games: Switching System Approach

Lee, Donghwan

arXiv.org Artificial Intelligence

The objective of this paper is to investigate the finite-time analysis of a Q-learning algorithm applied to two-player zero-sum Markov games. Specifically, we establish a finite-time analysis of both the minimax Q-learning algorithm and the corresponding value iteration method. To enhance the analysis of both value iteration and Q-learning, we employ the switching system model of minimax Q-learning and the associated value iteration. This approach provides further insights into minimax Q-learning and facilitates a more straightforward and insightful convergence analysis. We anticipate that these additional insights have the potential to uncover novel connections and foster collaboration between the control theory and reinforcement learning communities.
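For concreteness, a tabular sketch of one minimax Q-learning update (for brevity this uses the pure-strategy max-min; the classical algorithm evaluates the next state by solving the induced matrix game over mixed strategies, typically via a linear program):

```python
import numpy as np

def minimax_q_update(Q, s, a, b, r, s_next, alpha=0.1, gamma=0.99):
    """One illustrative tabular minimax Q-learning step for a two-player
    zero-sum game. Q has shape (S, A, B): state, maximizer's action a,
    minimizer's action b. The pure-strategy max-min below is a
    simplification of the mixed-strategy matrix-game value."""
    v_next = Q[s_next].min(axis=1).max()  # max_a min_b Q(s', a, b)
    Q[s, a, b] += alpha * (r + gamma * v_next - Q[s, a, b])
    return Q
```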